PARTIAL FREENESS OF RANDOM MATRICES 

JIAHAO CHEN AND ALAN EDELMAN 



orientations of eigenvectors. In the limiting case where the 
rotation matrix between the two bases is a random matrix of 
Haar measure, the matrices are said to be in generic position 
and the eigenvalue spectrum converges to the additive free 
convolution A EH B of two random matrices A and B as the 
matrix sizes increase to infinity.[i2] 

A natural question to ask is how accurately A EB B approx- 
imates the exact eigenvalue spectrum, or density of states 
(d.o.s.), of the sum A + B when the individual matrices are 
known to be noncommuting, but not necessarily free. We 
seek to quantify this statement in this paper. Our synthe- 
sis of existing results in the literature, together with a small 
number of new results, produces a coherent framework that 
provides insight into the inner workings of free probabil- 
ity on random matrices. We will also show that in addi- 
tion to helping classify two random matrices as being free 
or not free relative to each other, we can characterize them 
as having an intermediate, graduated property which we 
call partial freeness. Furthermore, we are able to quantify 
the leading-order discrepancy between freeness and partial 
freeness. This has already helped us explain the unexpected 
accuracy of approximations to the Hamiltonians of disor- 
dered condensed matter systems. [4] 

We begin with a brief, self-contained review of the free 
independence (i.e. freeness) from a random matrix theory 
perspective, and provide an elementary illustration of how 
computing the additive free convolution using an integral 
transform allows us to calculate the d.o.s. for the sum of 
free random matrices in a numerically exact manner. Next, 
we recap how the additive free convolution can also be ap- 
proached via the moments of random matrices, and in par- 
ticular how both classical and free independence can be in- 
terpreted as imposing precise rules for the decomposition of 
joint moments of arbitrary orders. We then define formally 
the notion of partial freeness and describe a procedure for 
detecting it numerically from samples of random matrices. 
Finally, we demonstrate with examples how partial freeness 
reveals insight into the optimal partition of a random matrix 
into the sum of two components such that their free convo- 
lution most closely resembles the original d.o.s. 

1. Freeness of two matrices 

We use the notation ( • } for the normalized expected trace 
(n.e.t.) Tr- of a N x N matrix. 

Definition 1. The random matrices A and B are free (or 
synonymously, freely independent) with respect to the n.e.t. 
if for all k e N, 

(1.1) ( Pl (A) qi(B) p 2 (A) q 2 (B) ■ ■ ■ p k (A) q k (B)) = 

for all polynomials p\, Cj\, p 2 , <?2/ ■ • ■ Pk> Ik such that (pi (A)) = 
(qi(B)) = • • • = O.Eol Definition 4.2] 

Fact 2. The preceding is equivalent to defining free independence 
using the special case of the centering polynomials p\ (x) = x n < — 
(x Hi ), qi(x) = x m < — {x m <), i = l,...,kfor positive integers 
n\, mi, . . . , Ufa »/• .I20 Proposition 4.3] 



Abstract. We investigate the implications of free prob- 
ability for random matrices. From rules for calculating 
all possible joint moments of two free random matri- 
ces, we develop a notion of partial freeness which is 
quantified by the breakdown of these rules. We pro- 
vide a combinatorial interpretation for partial freeness 
as the presence of closed paths in Hilbert space defined 
by particular joint moments. We also discuss how as- 
ymptotic moment expansions provide an error term on 
the density of states. We present MATLAB code for the 
calculation of moments and free cumulants of arbitrary 
random matrices. 
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in 

Free probability has received much attention in algebra 
t^J- since its discovery as an algebraic structure for noncommut- 
C> ing operators. [20] Subsequently, it has also found a place 
in combinatorics with the relationship between free cumu- 
r 7*. lants and noncrossing partitions JI12I (13! Its implications for 
random matrices, however, have not been as well studied. 
k> The discussion of free probability for random matrices usu- 
j_j ally centers around the onset of asymptotic freeness for cer- 
tain random matrices in the infinite limit. [22, 2] By contrast, 
the situation for finite random matrices have not been stud- 
ied much. This has motivated us to study the applicability 
of free probabilistic ideas to random matrix theory, concen- 
trating in particular on its numerical and computational as- 
pects. 

In this paper, we provide a random matrix theoretic per- 
spective on free probability theory, hoping that this will be 
accessible to readers familiar with linear algebra and ele- 
mentary statistics, without requiring intricate knowledge of 
operator algebras or combinatorics. We illustrate the ap- 
plication of free probability to a specific problem, namely 
the calculation of the eigenvalues of sums of matrices. In 
general, the eigenvalues of the sum of two matrices A + B 
are not the sum of the eigenvalues of the individual matri- 
ces A and B;j8j as matrices do not generally commute, the 
addition of eigenvalues must take into account the relative 
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This fact necessarily follows from the definition of the 
centering polynomials. That this is sufficient follows from 
the linearity of the n.e.t. Thus the two statements are equiv- 
alent. 

In principle, two matrices A and B can be checked for 
freeness by checking that all centered joint moments 

(B'"i-(B m i}) ■ ■ ■ (A nk -(A nk )) (B m <=-(B m *))) 

vanish for all positive exponents ri\,m\,. . .,n k ,m k . However, 
this is numerically impractical due to the need to check joint 
moments of all orders, as well as the presence of fluctua- 
tions from sampling error, which causes the higher order 
joint moments to converge more slowly with increasing de- 
gree. In practice, it is far easier to check that the d.o.s. of the 
exact sum A + B converges to the p.d.f. defined by the free 
convolution A EB B,[2i] which we will now define. 

1.1. The free convolution. 

Definition 3. The R-transform of the p.d.f. f^, denoted by 
Ra, is defined via the Cauchy transform: li2Tl IT2II 

fA(z) 



(1.2) 



w = lim 



-dz 



40 Jr R^ 1 (w) - (z + ie) 

where R^ 1 is the functional inverse of Ra in the usual sense, 

i.e. R^ 1 (Ra {to)) = Ra (^a 1 = w - Some intuition for 
the R-transform may be achieved by expanding the Cauchy 
integral as a formal power seriesj^] 



(1.3) G A O) = lim / 



Ia{z) 



-dz 



k=0 



w 



k+1 



l R w — z — ie 
where u k is the fcth moment 

(1.4) Fk {A)= J x k f A {x)dx=(A k ) 

In other word, the Cauchy transform of a p.d.f. is a gen- 
erating function of its moments. We then have that the R- 
transform Ra complements the Cauchy transform Ga in the 
sense that 



(1-5) 



R/ 



w 



G A (to). 

The free cumulants v k are the coefficients produced by 
the expansion of the R-transform as a formal power series 
in 1/w, i.e. 



(1.6) 



Ra H = E 



k=0 



k+1 



W 



with v k = 1. The free cumulants are particular combinations 
of moments v k = v k {y.\, . . . , u k ) which will be made more 
explicit later. 

Definition 4. The free convolution A EB B is defined via its 
R-transform: 

1 



(1-7) 



Rabb (h>) = R A {w) + R B {w) 



I Physicists may recognize Ga (iv) as the retarded Green function corre- 
sponding to the Hamiltonian A. 



The free cumulants can be seen as linearizing the free 
convolution, lii2l [13] in the sense that for all k > 0, 

(1.8) v k {ASB) =v k {A)+v k {B) 

and the subtraction ofl/zv produces a properly normalized 
p.d.f. by conserving v {A EB B) = fi (A EB B) = 1. 

we show an example of calculating /asb 



In Section 



1-3 



analytically via the R-transform. In general, such analytic 
calculations are hindered by the functional inversions re- 
quired in | |i.2| . This has inspired interesting work in calcu- 
lating A EB B numerically, such as in the RMTool package. [[14) 
We discuss instead an alternate strategy starting directly 
from numerical samples of random matrices. In situations 
where only the numerical samples are known, it may be 
convenient instead to use the Result described in the next 
section. 

1.2. Free convolution from random rotations. 

Definition 5. A square matrix Q is a unitary /orthogonal/ 
symplectic random matrix of Haar measure if for any con- 
stant unitary/ orthogonal /symplectic matrix P, the integral 
of any function over dQ is identical to the integral over 
d (PQ) or that over d (QP). 

Example 6. Unitary matrices of dimension N = 1 are simply 
scalar unit complex phases of the form e ,e . Haar measure 
over e lB can be written simply as d8/2n. This is manifestly 
rotation invariant, as multiplying e' e by any constant phase 
factor e 1 ^ simply changes the measure to d{6 + (p)/2n = 
de/2n. 

Haar measure generalizes the concept of uniformity to 
higher dimensions by preserving the notion of invariance 
with respect to arbitrary rotations. Consequently, the eigen- 
values of Q lie uniformly on the unit circle on the complex 
plane. [6 1 Explicit samples can be generated numerically by 
performing QR decompositions onNxN matrices sampled 
from the Gaussian orthogonal (unitary) ensemble. f5l 

Fact 7. For a pair ofHermitian (real symmetric) random matrices 
A and B, the d.o.s. of A + QBQ + , where Q is a unitary (orthog- 
onal) random matrix of Haar measure, coincides with the p.d.f. of 
A EB B in the limit of infinitely large matrices N — > 00. 

Consider the diagonalization of A — Qa^aQ\ an< ^ ^ = 
Qb^bQb- The d -°- s - of A + QBQ + is identical to that of 
A A + {Q a QQb) a b (QIQ^Qa), since these matrices are re- 
lated by the similarity transformation (•) Qa- However, 
the Haar property of Q means that the d.o.s. of this matrix 
is identical to that of + QAgQ + . This affords us another 
intepretation of free convolution: it describes the statistics 
resulting from adding two random matrices when the basis 
of one matrix is randomly rotated or "spun around" relative 
to the other. The information about the relative orientations 
of the two bases is effectively ignored. 

The freeness of random matrices is often discussed in the 
context of it emerging only in the limit of infinitely large ma- 
trices, i.e. N — » 00, where it is called asymptotic freeness JI22I 
Perhaps the most well-known example is that two matrices 
sampled from the Gaussian ensembles (orthogonal, unitary 
or symplectic) are free. [12] Nevertheless, finite-dimensional 
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random matrices can exhibit freeness as well. We now pro- 
vide some examples and illustrate the analytic calculation 
of the free convolution using the R-transform. 

1.3. Examples of free finite-dimensional matrices. 

Example 8. Consider the 2x2 real symmetric random ma- 
trices 

(1.9a) A{t) = U(t)a z U{-t) 

(1.9b) B(t) — U (— f) <x z U (f) 



where c z is the Pauli matrix 





-1 



and U (f ) is the 



rotation matrix 



sin t 



, and the rotation angle t ±L 



cos t 

— sin t cos t 

is uniformly sampled on the interval [0, 71] . By construction, 
the d.o.s. of A (t) and B (f) are identical; their eigenvalues 
have the p.d.f. 

(1.10) f A (x) = f B (x) = l(s(x + l)+6(x- 1)) 

where 5 (x) is the Dirac delta distribution. Furthermore, for 
any particular t, the sum of A (t ) and B (t) can be written in 
the basis where A (t ) is diagonal as 

(1.11) M (t) = a z + U (-2t) a z U (2t) 

By construction, U (2f ) is of uniform Haar measure and so 
the d.o.s. of M (f ) is given exactly by the additive free con- 
volution of A (f) and B (t). \\22l [12! The K -trans forms of j a 
and fg are 

+ VT 



(1.12) 



R A (w) = R B (w) 



■Aw 2 



2w 



where we have taken only the positive root so that that the 

R-transforms remain nonnegative. We then obtain 

(1.13) 

Rasb (™) = Ra M + R B (w) 



1 

w 



Aw 2 



2w 



Finally, we calculate the p.d.f. using the Plemelj inversion 
formula: 

1 



(1.14a) 
(1.14b) 



/abb (*) 



7T 



Im R A ^ 
1 



w 



TlVT 



which is the arcsine distribution on the interval [—1,1]. The 
odd moments all vanish by the even symmetry of /aE5B/ 
and the even moments are the central binomial coefficients 

F2 „(AfflB)= 

This example illustrates how the free convolution of two 
discrete probability distributions can be a continous proba- 
bility distribution. It is useful to contrast this result with the 
classical convolution A ★ B, 



(1.15a) 
(1.15b) 



Ia*b 0) = /a */b = / /a (y) /b (x - y) dy 



- {6{x + 2)+25 (x) + 8{x- 



2)) 



which is simply a discrete binomial distribution. The results 
of the two convolutions are plotted in figure |i.i| 

Perhaps surprisingly, freeness can also be a property be- 
tween two finite, deterministic matrices. 



Figure 1.1. The d.o.s. /abb ( x ) and /a*b{ x ) 
for the free (solid black line) and classical 
convolutions (dashed blue line) of the ma- 
trices in Example 1, as given in 1 1.14b! and 
< |i.i5b| respectively. The heights of the lines 
in the plot of /a*b indicate the point masses. 




Example 9. The 2N x 2N deterministic matrices 



/ 1 
1 



(1.16a) 



1 

1 



/ 



(1.16b) B 

are free for all N 
Pauli matrix a x = 



1 

1 



1 \ 



7 



V 1 

> 2. Each consists of N copies of the 

1 

1 



, with B having a basis shifted 



relative to A with circulant (periodic) boundary conditions. 
The d.o.s. are the same as in the previous example and so 
the calculations of A EB B and A ★ B proceed identically. 

The matrix B', being B without the circulant entries on the 
lower left and upper right, is not free relative to A. However, 
A and B' are still asymptotically free as N — » 00. 

1.4. Comparison of free and classical convolutions. We con- 
clude this introductory survey by fleshing out the analogy 
between the free additive convolution EB and the classical 
convolution *. We note that the Fourier transform ~ turns 
classical convolutions into products according to the convo- 
lution theorem, i.e. /a */b = /a/b/ furthermore, taking the 
logarithm allows this to be written as a linear sum: 

(1.17) log/^ */b = log /a + log/fj 

Furthermore if j a and /g are p.d.f.s, then log/,4 and 
log fs can be identified as the corresponding cumulant gen- 
erating functions, i.e. log f a can be expanded in a formal 
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power series 

(1.18) logf2 (w) 



E 

k=0 



K n {A) W n 



IV. 



where k„ (A) is the nth cumulant of /a, and similarly for 
f B . We have also have that k„ (A * B) = K n (A) + k„ (B) for 
n > 1, i.e. that cumulants linearize the convolution. 

For this reason, the i?-transform is often described as the 
free analogue of the log-Fourier transform, with the Cauchy 
transform being the analogue of the Fourier transform and 
the functional inversion playing the part analogous to the 
logarithm. Naturally, the free analogue of the cumulant K n 
is the free cumulant v n . 

Finally, we note that the sum of two random variables 
a and b, sampled with p.d.f.s f& and fg respectively, itself 
has the resulting p.d.f. /A*B-lj7j If we have matrices A and 
B with d.o.s. /a and fg respectively, then the matrix A + 
nBFI T , where FI is a random permutation matrix, has d.o.s. 
f A */b- This is equivalent to the p.d.f. formed by picking 
an eigenvalue of A at random and an eigenvalue of B at 
random. In this sense, the discrete random permutation FI 
which generates the classical convolution is the analogue of 
the continuous random rotation Q of uniform Haar measure 
in free convolution. 

Table [1] summarizes the analogies between the free and 
classical convolutions. 

Table 1. Comparison of the free and classi- 
cal convolutions. 



AfflB 


A-kB 


R-transform R 


log-Fourier transform log/ 


Cauchy transform G 


Fourier transform / 


functional inversion 


logarithm 


free cumulants v n 


(classical) cumulants k„ 


Plemelj inversion 


inverse Fourier transform 


A + QBCf 


A + TIBTl 1 


uniform Haar measure Q 


random permutations FI 



2. D.O.S. OF SUMS OF RANDOM MATRICES 

We now consider general N x N random matrices A and 
B which are not necessarily free, and investigate moments 
of the sum, u n (A + B). The moments {ji n } capture all the 
information contained in the corresponding p.d.f., provided 
that they all exist. For random matrices, this simply means 
that all powers of the matrix must have a finite n.e.t. For 
simplicity, we assume in the rest of this paper that all neces- 
sary moments exist. 

Recall that the primordial definition of freeness con- 
cerns itself with joint moments of A and B. We will now 
show how each moment of A + B subdivides into sums over 
joint moments of A and B, and how this affords us fine-grain 
detail into the inner workings of free and classical indepen- 
dence, and whether or not they are obeyed by the matrices. 

2.1. Calculating moments of A + B from their joint mo- 
ments. We consider the expansion of the d.o.s. of the exact 
sum A + B in terms of the moments {}i n (A + B)}. From 



the definition of the moment of a random matrix, these mo- 
ments are given by 

(2.1a) 

u n = ((A + B)") 
(2.1b) = (A' 1 + A n ~ l B + A"- 2 BA + ■ ■ ■ + B"' 



(2.1c) 



(A n ) 



A n ~ 1 B 



A n 



-2 B 2 



(B n ) 



where the second equality follows directly from the non- 
commutative binomial expansion of 1 2.1b I, and the third 
equality follows from the linearity of (•) and its cyclic in- 
variance property, i.e. (AB) = (BA). 

We refer to < |2.ic) as the word expansion of u„. As written, 
there are 2" terms in 1 2.1b} but some of them yield the same 
n.e.t. in | |2.ic} identically because of cyclic invariance. It 
turns out that the equivalence classes defined by grouping 
identical terms in this manner are exactly those of combina- 
torial necklaces. f9] [15) 

Definition 10. A (n, A:)-word W is a string of n symbols, each 
of which can have any of k values. A (n, k) -necklace [J\f\ is 
the equivalence class over in, k) -words W with respect to 
cyclic permutations FI of length n, i.e. 

(2.2) [Af] = {w e W|37r G II : N = tzw} 

We note that there are efficient algorithms for enumerat- 
ing all (n, k) -necklaces for a given n and fc.fi6l \T% Further- 



more, the total number of terms in the word expansion 1 2.1c I 
is well-known: 

Fact 11. The number of (n,k) -necklaces is 

(2.3) N(n,k) = 1 -Y.'P( d ) kn,d = lt k8Cd{hH) 



d\n 



where d\n means that d divides n, <p is the Euler totient function, 
gcd is the greatest common divisor.^ [13J By definition, (p (d) 
is the number of integers m in the range 1 < m < d that are 
relatively prime to d, i.e. gcd (d,m) = 1. 

In addition, we can determine the multiplicity of each 
term in | |2.ic) , which is identical to the number of words in 
the equivalence class defined by each corresponding partic- 
ular necklace. We state this very simple result without proof 
and provide an example. 

Fact 12. Let m = # ([A/ - ]) be the number of (n,k)-words belong- 
ing to the equivalence class that defines the necklace [J\f]. Then m 
is the length of the longest cyclic permutation that leaves any word 
W G M unchanged, i.e. it is the length of the longest subword S 
of a word W G Af such that W = S" /m . 

Example 13. The necklace [AABAAB] = [A 2 BA 2 B] is an 
equivalence class over (6, 2)-words of size 3, since apply- 
ing a (one-symbol) cyclic permutation three times leaves 
A 2 BA 2 B unchanged: 

(2.4) AABAAB 1 — ^ ABAABA ^ BAABAA ^ AABAAB 

i.e. # ( [A 2 BA 2 B] ) = 3 which follows from the fact that 
AABAAB = (AAB) 2 . 

The algorithm for the enumeration of necklaces, together 
with knowledge of their multiplicities, allows us to sum the 
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joint moments in the word expansion | |2.ic) to obtain fi n . We 
now proceed to calculate each of these joint moments. 

2.2. Decomposition rules for joint moments. We have now 
reduced the problem of calculating }i n to that of calculat- 
ing joint moments; each has the form (A ni B m ■ ■ ■ A" k B mk ) 
for positive integers n\, m\, . . ., n^, m^. We can understand 
classical and free independence as prescriptions for comput- 
ing such joint moments in terms of the pure moments (A), 
(A 2 ), . . ., and (B), (B 2 ), ... of A and B respectively. 

Fact 14. For classically independent random matrices A and B, 
we have 

(2.5a) (A" 1 B"' 1 ■ • • A' lk B mk ) = (A"i+-+"*B mi+ - +m *) 
(2.5b) = (B m+ - +mk ) 

i.e. A and B behave as if they commute. [[12I 

The analogous rule for free independence is somewhat 
more complicated, but can be derived from the primordial 
definition of freeness. 

To cast the decomposition rule for free independence in 
the same form as that for classical independence in 12.5b!, 
we can use again the linearity of the n.e.t. to expand the 
joint moment 

= ((A"i -(A" 1 ))(B ffl i -(B'"i))--- 

(2.6a) x (A" k - (A" k )) (B mk - (B mk ))) 

= (A"iB ffll ■ ■ ■ A nk B"' k ) 

- (A" 1 ) (A" 2 B m2 ■ ■ ■ A" k B nik+mi ) 

- (B mi ) (A" 1+n2 B n ' 2 ■ ■ ■ A" k B mk ) 

(2.6b) + (-i)£/=i(";+™;) (A" 1 } ■ ■ ■ (A" k ) (B m i) ■ ■ ■ {B n ' k ) 

which can be rearranged immediately to give a recurrence 
relation for the joint moment (A ni B mi ■ ■ ■ A" k B r " k ) in terms 
of lower order joint moments. 

3. Partial freeness 

We now present our new main concept, namely a pro- 
posal for defining partial freeness. 

Definition 15. Two random matrices A and B are partially 
free of order p if the first statistically significant difference 
between A + B and A EB B occurs at the pth moment }ip, i.e. 
1 2.6b} holds for all joint moments of the form 

(A"iB'"i ■ • • A" k B mk ) 

with positive integers n\, m\, . . . ,n k , YJl-\ ( n i + m i) — <?/ ror 
all q < p, but there exists at least one joint moment for q = p 
for which 1 2.6b I does not hold. We will say that A and B are 
free to p moments. 

We have immediately: 
Fact 16. Let A and B he a pair of N x N diagonalizable ran- 
dom matrices with A = Qa^aQ\ an ^ ^ = Qb^-bQb- If 
E (QgQ^) ( y = 1/N for each matrix element of Q\Qa, then 
A and B are free to p > 3 moments. 

This is a restatement of the matching three moments the- 
orem of Refs. liToHTTl . 

Our definition is a natural generalization of the concept 
of freeness, and they coincide if all the moments match. 



Claim 17. Two random matrices A and B are free if they are 
partially free to all orders p — » 00. 

This follows immediately from the definition of partial 
freeness above and the definition of freeness in 

Example 18. The N x N random matrix A, a diagonal ma- 
trix with elements i.i.d. standard Gaussian random variates, 
and B, the tridiagonal matrix 

/ 1 \ 



1 



1 

/ 



V 1 

are partially free of order 8. 

This can be verified by explicit calculation of the first 
eight moments. Again by even symmetry all the odd mo- 
ments of A + B vanish while even moments are 1, 3, 17, 125, 
1099, 11187, 12 9759/ • ■ ■ Furthermore we can identify the 
leading order deviation as arising from the term / (AB) 4 ^} = 
1. 

This example illustrates that (partial) freeness can also be 
a property of a pair of random and deterministic matrices. 

Since free probability affords us a method for calculating 
the d.o.s. of the sum of two matrices A + B using the free 
convolution A EB B when A and B are free, it is natural to 
ask how good an approximation the free convolution A EB B 
is to the d.o.s. of the sum A + B when A and B are not free, 
but only partially free. We have proposed to quantify this 
statement using asymptotic moment expansions,[4l which 
we will describe in the next section. 

4. Distinguishing between two distributions using 
asymptotic moment expansions 

We have described partial freeness in terms of discrep- 
ancies in the moments between the exact p.d.f. and that 
from the free convolution. This suggests to us that a natural 
framework for expressing the difference between partially 
free matrices and (completely) free matrices can be found 
in the setting of asymptotic moment expansions,)^] which 
are parameterized naturally using the moments (or cumu- 
lants) of the two distributions being compared . The two 
standard approaches are the Gram-Charlier series of Type 
A and the Edgeworth series,[i8J in which a probability den- 
sity / is written as an expansion about another, reference 
probability density f . 

4.1. The Gram-Charlier Type A series. The Gram-Charlier 
series arises immediately from the orthogonal polynomial 
expansion with respect to / as the weight: [[19) Chapter IX] 



(4-i) 



f( x )=J2 C "^n ( X ) / ( X ) 
n=0 



where the coefficients can be shown, by the orthonormality 
of the orthogonal polynomials, to be 

(4-2) 

/ <p m (x) f (x) dx=Yj C « / 4>m (*) 4>n (*) / (*) dx = C m 
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i.e. the with coefficient is the expected value of the with or- 
thogonal polynomial with respect to the probability density 
/. By expressing the orthogonal polynomials in the mono- 
mial basis, 

m 

(4.3) <p m (x) = ]T) a mk x k 



(44) 



E a mk \ xk f 0) dx = E "mkFk 
k=0 J]R k=0 



we get an explicit expansion of the Gram-Charlier coeffi- 
cients as linear combinations of the moments ji^ of /. The 
Gram-Charlier Type A series is simply a special case of the 
orthogonal polynomial expansion just described as special- 
ized to the case of a Gaussian weight: 



4.3. Deriving the Gram-Charlier series from the Edgeworth 
series. Rederiving the Gram-Charlier form from the Edge- 
worth series reveals additional interesting relationships. One 
such relation follows from the identity 
(4.11) 

ex p ( E —t"] = E B "( fl1, V ' = v gn f 

\n=\ U - J n=0 n - n=0 n - 

where B„ is the complete Bell polynomial of order n.[i] Set- 
ting t = —d/dx gives immediately 

r,,)-«p(£^(-!)-) 
b„ {{k„-k„}) ( d_y 

dx 



(4-13) 



m 



(4-5) 



/» = <&(*) 



I In 



exp 



We will call < |4.i3[ | the direct series of T. 

Futher specializing again to the Gaussian reference, we 
can use Rodrigues's formula 



in which the corresponding family of orthogonal polynomi- 
als {<p n } is simply the (probabilist's) Hermite polynomials, 
conventionally denoted by { He n } . 

4.2. Edgeworth series. The Gram-Charlier series comes from 
applying the operator 



(4.6) 



T ( x ) = E Cn< P n (*) 

n=0 



that tranforms / into / = Tf. The Edgeworth series is de- 
rived by rewriting T as a differential operator, as can be 
derived using the relations between a probability density, 
its characteristic function %(x) and the moment generating 
function, and its cumulant generating function: 



(4.7) X (t)= f e itx f(x)dx = E f{ 
(4.8) 



jtx 



n=0 \n=l 



Writing down the analogous relations for f and dividing 
yields 



(4.14) He B (z)«D(z) = (-l) H (^J 
so that 

(4.15) T (x) d> (x) = £ Bn{{Kn 7* n}) He n (x) d> (x) 

n=0 n - 

The first few coefficients c n = B n ({k„ — k„}) have been 
tabulated explicitly! 18 1 but to our knowledge the relation- 
ship to the Bell polynomials have not been previously dis- 
cussed in the literature. 

4.4. Quantifying the effect of differing moments. The Edge- 
worth series yields a useful result for error quantification. If 
the first k — \ moments of two p.d.f.s / and / are the same, 
but the fcth moments differ, then the leading term in the 
Edgeworth series is 

(4.16a) 

f(x)=f(x) + 
(4.16b) 



(-If B* ({«*-«*}) ~r 



/(*) ( 3e ) + o(/( k+1 )) 



/(*) + 



{-lTiH-fik)* 



jfc! 



4.5. 



The locus of differences in moments. The word ex- 



(4-9) 



vn=l 



which, after rearrangement and taking the inverse Fourier 
transform, yields 



pansion 1 2.1c I allows us to refine the error analysis in terms 
of specific joint moments that contribute to ^ — pj-. It is in- 
structive to interpret each term in | |2.ic) , being a trace of a 
product of k matrices, as a closed path with up to k hops as 
allowed by the structure of the matrices being multiplied. 




As before, the Edgeworth series is usually presented for the 
Gaussian case / = <!>. Although these series are formally 
identical, they yield different partial sums when truncated 
to a finite number of terms. The Edgeworth form is gen- 
erally considered more compact than the Gram-Charlier se- 
ries, as only the former is a true asymptotic series. lli8"l [^J 



Example 19. Consider A and B as in Example 18 which 
are partially free of degree 8 and whose discrepancy in the 
eighth moments relative to complete freeness is in the term 

(^(AB)^. Note that B is exactly the adjacency matrix of 
the one-dimensional chain ■ — ■ — — ■ — ■ with N nodes 
and periodic boundary conditions, and we can interpret 

4\ 



(AB) J as corresponding to specific paths on this lattice. 
These paths must have exactly four hops, as A, being di- 
agonal, does not permit hops, whereas B, having nonzero 
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entries only on the super- and sub-diagonals, require ex- 
actly one hop either to the immediate left or the immediate 
right. This gives rise to four different paths as illustrated 
in Figure 4.1 The paths neatly represent the terms arising 



Figure 4.1. Paths contributing to the term 
(AB) 4 \ in Example 
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(ABf 



+ 



+ 










from decomposing the joint moment into moments of scalar 
i.i.d. standard Gaussians g u i.e.: 

(4.17a) ((AB) 4 ^} = (gigi-igigi+i) + (gigi+igigi-i) 

+ (gigi-lgigi-l) + (gigi+igigi+i) 
(4.17b) =2E ( gi ) 2 E (g?) + 2E (g?) 2 = 2 

5. Computational implementation 

5.1. Numerical calculations on empirical samples. The for- 
malism we have developed can be applied directly to empir- 
ically sampled pairs of random matrices without knowledge 
of their underlying distributions. To demonstrate this, we 
have developed a proof of concept tool and implemented it 
in a MATLAB program: 

(1) Generate f pairs of samples {(Aj, B,)}/=i oi n x n 
Hermitian random matrices A and B. 

(2) For each pair: 

(a) Calculate the exact eigenvalues of A,- and B;. 

(b) Calculate the eigenvalues of the sample A,- + B, 
of the exact sum A + B. 

(c) Calculate n samples of the free convolution /abb 
using the eigenvalues of M; = A, + Q;B;Q* us- 
ing a numerically generated Haar orthogonal 
(or unitary) matrix Q ( . 

(d) Calculate n samples from the classical convolu- 
tion /a+b using the eigenvalues of L, = A, + 
Flr 1 B ! TI ! - using a random permutation matrix 

n f . 

(3) Calculate the first 2K moments of A and B as well 
as the first K moments of \i k (A EB B), ft (A * B), and 

ft(A + B). 

(4) Calculate the degree k for which A and B are par- 
tially free by testing for the smallest k such that the 
moments of the free convolution differ from the ex- 
act result, i.e. test the hypothesis ft (A + B) ^ ft (A B 

(5) Use Sawada's algorithmic] to enumerate all unique 

terms 7) = (A m VB n V ■ ■ ■ a"W^ in ((A + B 
For each term T,-: 



•+"*/; 



(a) Calculate 

T (cl) = ^ A mt j +-+m k .^ ^ B n y - 

which would be its value expected from classi- 
cal independence. Test the hypothesis of equal- 
ity Tj = T^ d \ 

(b) Calculate the normalized expected trace of the 
centered term 

•Ac) = ^ A m tj _ ( A my)) ( B "y _ . . . ) 

which would be expected to vanish if A and B 
were truly free. Test the hypothesis of equality 



<c) 



0. 



(k) 

(6) Calculate the kth derivative f\mB usm g numerical 
finite difference. 

(7) Plot f A+B , f A m and f A mB + in ~ h) /kl ■ f A k m- 
This algorithm tests for partial freeness of degree k < K, 
attempts to identify the locus of discrepancy by testing all 
possible (A:, 2) -words, and calculates the leading order cor- 
rection term to the density of states. In practice, we account 
for sampling error in the hypothesis tests in Steps 4 and 5 
by calculating the standard error of each term being tested, 
and evaluating the p-value for each such hypothesis. For 
example, the standard error of the kth moment is 



(5-i) 



SE (pi k ) 



4 



and the standard error of a term T; in the expansion of 



{A + B) k is 



(5.2) SE (Tj) = \ 



(( 



■ A K i'B K j> 



Tf 



The calculation of necessary standard errors require in- 
formation up to the 2Kth moment for calculating variances 
stemming from the Kth moment, which is why 2K moments 
of A and B are calculated in Step 3. 

5.2. Symbolic computation of moments and joint moments. 

If the analytic forms of the random matrices A and B are 
known, a computer algebra system can be used to calculate 
the necessary moments symbolically. Some examples of the 
calculation of the moments and joint moments in Mathe- 
matica is shown in Algorithm Q. The code provides simple 
functions for calculating normalized expected traces of an 
arbitrary joint moment or centered joint moment in terms 
of the distribution of matrix elements. For simplicity, only 
the i.i.d. case of one scalar probability distribution with mo- 
ments {m n } is illustrated, although this approach can be 
extended to more complicated situations as necessary. 

6. Summary 

g,We have proposed an extension of the notion of freeness 
or random matrices. 

The ideas in this paper are developed for two random 
matrices but our analysis generalizes readily to multiple ad- 
ditive free convolutions. 
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Algorithm 1 Mathematica code for calculating normalized 
expected traces of joint matrix products and centered joint 
matrix products for finite dimensional random matrices. 

NN = 100; f* Size of matrix *) 

(* The following generates the map which 
formally evaluates the expectation of the 
G random variables assuming that they 
are i.i.d. with vanishing mean. *) 
MomentsOfG := Flatten [{ 

Table [Subscripts, j]~i -> Subscript [m, i], 
{i, 2, NN}, {j, 1, NN}], 

Table[Subscript[G, j ] -> 0, {j, 1, NN}] 
}]; 

ExpectationOfG[x_] := x /. MomentsOfG; 

(* Normalized expected trace *) 
AngleBracket [x_] := ExpectationOfG[ 
Tr[x]/NN // Expand ]; 

(* centering operator *) 

c[x_] := (x - AngleBracket [x] IdentityMatrix[NN] ) 

(* Example 19 *) 

A = DiagonalMatrix[Array[Subscript[G, #] &, NN]]; 
B = SparseArray[{Band[{l, 2}] -> 1}, {NN, NN}]; 
B[[l]][[-1]] = 1; (* Add circulant boundary *) 
B = B + Transpose[B] ; 

AngleBracket [c [A. A] . c[B.B] ] 
(* Output: *) 

AngleBracket[MatrixPower[c[A] .c[B] , 4] ] 

(* Output: 2 Subscript [m, 2]~2 *) 

(* Specialized to standard Gaussian Gs *) 
GaussianG = Array[Subscript[m, #] -> 

Moment[NormalDistribution[] , #] &, NN] ; 
AngleBracket [MatrixPower [A+B] , 4]] /. GaussianG 

(* Output: 1099 *) 
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